Different approaches for identifying important concepts in probabilistic biomedical text summarization

Abstract

Automatic text summarization tools help users in the biomedical domain acquire their intended information from various textual resources more efficiently. Some biomedical text summarization systems base their sentence selection approach on the frequency of concepts extracted from the input text. However, exploring measures other than frequency for identifying the valuable content of the input document, and considering the correlations that exist between concepts, may be more useful for this type of summarization. In this paper, we describe a Bayesian summarizer for biomedical text documents. The Bayesian summarizer first maps the input text to Unified Medical Language System (UMLS) concepts, then selects the important ones to be used as classification features. We introduce different feature selection approaches to identify the most important concepts of the text and to select the most informative content according to the distribution of these concepts. We show that, with an appropriate feature selection approach, the Bayesian biomedical summarizer can improve summarization performance. We perform extensive evaluations on a corpus of scientific papers in the biomedical domain. The results show that, as measured by the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics, the Bayesian summarizer outperforms biomedical summarizers that rely on the frequency of concepts, as well as domain-independent and baseline methods. Moreover, the results suggest that using the meaningfulness measure and considering the correlations of concepts in the feature selection step leads to a significant increase in summarization performance.
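The pipeline outlined in the abstract (concepts as features, feature selection, Bayesian sentence classification) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes sentences have already been mapped to concept identifiers (the identifiers below are made-up placeholders, not real UMLS codes), it uses plain frequency-based feature selection (the baseline the abstract argues against, not the meaningfulness measure), and it uses a Bernoulli naive Bayes classifier with Laplace smoothing trained on hypothetical labelled sentences. All function names are hypothetical.

```python
import math
from collections import Counter

def select_features(sentences, min_count=2):
    """Frequency-based selection: keep concepts found in >= min_count sentences."""
    counts = Counter(c for s in sentences for c in set(s))
    return sorted(c for c, n in counts.items() if n >= min_count)

def train(sentences, labels, features):
    """Estimate Bernoulli naive Bayes parameters with Laplace smoothing."""
    model = {}
    for y in (True, False):
        rows = [set(s) for s, lab in zip(sentences, labels) if lab == y]
        prior = (len(rows) + 1) / (len(sentences) + 2)
        likelihood = {f: (sum(f in r for r in rows) + 1) / (len(rows) + 2)
                      for f in features}
        model[y] = (prior, likelihood)
    return model

def log_odds(sentence, model, features):
    """log P(summary | sentence) - log P(non-summary | sentence)."""
    present = set(sentence)
    score = 0.0
    for y, sign in ((True, 1.0), (False, -1.0)):
        prior, likelihood = model[y]
        log_p = math.log(prior)
        for f in features:
            p = likelihood[f]
            log_p += math.log(p if f in present else 1.0 - p)
        score += sign * log_p
    return score

def summarize(sentences, model, features, k):
    """Return indices of the k highest-scoring sentences, in document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: log_odds(sentences[i], model, features),
                    reverse=True)
    return sorted(ranked[:k])

# Hypothetical training data: each sentence is a list of placeholder concept IDs,
# labelled True if it belongs in a reference summary.
train_sents = [["C_diabetes", "C_insulin"], ["C_diabetes", "C_glucose"],
               ["C_weather"], ["C_weather", "C_travel"],
               ["C_insulin", "C_glucose"]]
train_labels = [True, True, False, False, True]

features = select_features(train_sents, min_count=2)
model = train(train_sents, train_labels, features)

# Hypothetical new document to summarize.
doc = [["C_weather"], ["C_diabetes", "C_insulin"],
       ["C_travel"], ["C_glucose", "C_insulin"]]
summary_indices = summarize(doc, model, features, k=2)
```

Swapping `select_features` for a different selection criterion changes which concepts the classifier sees, which is the design axis the abstract says it explores.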
